
# Estimate overdispersion: n = 120 observations, k = 19 parameters, residual df = n - k = 101
stdres2 <- rstandard(nb5bis)
print(paste("Estimated overdispersion", sum(stdres2^2)/101))
## [1] "Estimated overdispersion 1.62538490690911"
nb5bis.pred <- predict(nb5bis, newdata = data1[120:126, ], type = "response")
paste("RMSE:", sqrt(mean((nb5bis.pred - data1$`New cases/day`[120:126])^2)))
## [1] "RMSE: 1379.56664915407"
#paste("MSE:", mean(nb5bis$residuals^2))
paste("AIC:", nb5bis$aic)
## [1] "AIC: 1435.14403991019"
paste(c("Null deviance: ", "Residual deviance:"),
      round(c(nb5bis$null.deviance, deviance(nb5bis)), 2))
## [1] "Null deviance: 1416.29" "Residual deviance: 146.46"
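The overdispersion figure at the top of this chunk divides the sum of squared standardized residuals by a hard-coded residual degrees of freedom (101). A common alternative, sketched below, uses Pearson residuals and reads the degrees of freedom from the fitted model with `df.residual()`. This is only an illustration on simulated Poisson data: the objects `sim` and `fit` are hypothetical and are not the report's `data1` or `nb5bis`.

```r
# Hypothetical simulated data; NOT the data1 / nb5bis objects used in the report.
set.seed(1)
sim <- data.frame(x = seq_len(120))
sim$y <- rpois(120, lambda = exp(1 + 0.02 * sim$x))

# Simple Poisson GLM used only to demonstrate the calculation.
fit <- glm(y ~ x, family = poisson, data = sim)

# Pearson-based dispersion: sum of squared Pearson residuals
# divided by the residual degrees of freedom taken from the model.
pearson_res <- residuals(fit, type = "pearson")
dispersion  <- sum(pearson_res^2) / df.residual(fit)

print(paste("Estimated overdispersion", dispersion))
```

Values well above 1 would suggest more variability than the assumed distribution allows. Reading the degrees of freedom from the model avoids having to update a constant whenever terms are added or removed.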
Applying ANOVA to compare the negative binomial models
We compare nb1, nb2, and nb5 because they are nested, and we are mainly interested in whether the fifth model fits better than the first.
# Compare the nested negative binomial models with likelihood ratio tests
anova(nb1, nb2, nb5)
## Likelihood ratio tests of Negative Binomial Models
##
## Response: Cumulative cases
## Model
## 1 Elapsed time
## 2 `Elapsed time` + `Grupo de edad_19_30` + `Grupo de edad_31_45` + `Grupo de edad_46_60` + `Grupo de edad_60_75` + `Grupo de edad_76+`
## 3 `Elapsed time` + `Grupo de edad_19_30` + `Grupo de edad_31_45` + `Grupo de edad_46_60` + `Grupo de edad_60_75` + `Grupo de edad_76+` + `Departamento o Distrito_Bogotá D.C.` + `Departamento o Distrito_Boyacá` + `Departamento o Distrito_Caldas` + `Departamento o Distrito_Casanare` + `Departamento o Distrito_Cauca` + `Departamento o Distrito_Cundinamarca` + `Departamento o Distrito_Meta` + `Departamento o Distrito_Quindío` + `Departamento o Distrito_Risaralda` + `Departamento o Distrito_Santander` + \n `Departamento o Distrito_Tolima`
##      theta Resid. df    2 x log-lik.   Test    df LR stat. Pr(Chi)
## 1 11252780       118       -21905.18
## 2 12919543       113       -20629.27 1 vs 2     5 1275.907       0
## 3 13728063       102       -18042.49 2 vs 3    11 2586.780       0
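To make the table easier to read, the sketch below reproduces its last three columns by hand. It assumes the usual likelihood ratio construction: the statistic is the increase in 2 x log-likelihood between consecutive models, the test degrees of freedom are the drop in residual degrees of freedom, and the p-value is the upper tail of a chi-squared distribution. The numbers are copied from the output above; the variable names are illustrative only.

```r
# 2 x log-likelihood and residual df copied by hand from the anova() output.
two_loglik <- c(nb1 = -21905.18, nb2 = -20629.27, nb5 = -18042.49)
resid_df   <- c(nb1 = 118, nb2 = 113, nb5 = 102)

# LR statistic: gain in 2 x log-likelihood from one model to the next.
lr_stat <- diff(two_loglik)

# Test df: number of parameters added, i.e. the drop in residual df.
test_df <- -diff(resid_df)

# Upper-tail chi-squared p-value for each comparison.
p_value <- pchisq(lr_stat, df = test_df, lower.tail = FALSE)

data.frame(test = c("1 vs 2", "2 vs 3"),
           df = test_df,
           lr_stat = lr_stat,
           p_value = p_value,
           row.names = NULL)
```

Both p-values are far below any conventional threshold, which is why the table prints them as 0: each larger model improves significantly on the one nested inside it.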